Learning from partially labelled data—with confidence
نویسندگان
چکیده
In this paper, we propose a unifying treatment of several strategies for training mixture models from label-deficient data. After a review of different approaches to estimating classification models on partially labelled data using mixture models, we identify a number of problems which lead us to propose a new EM variant. The aim is to better handle unlabelled data and provide a more confident discrimination decision. This is illustrated by an experimental comparison of the different models on the Leptograpsus crab data.
منابع مشابه
Markov Blanket Discovery in Positive-Unlabelled and Semi-supervised Data
The importance of Markov blanket discovery algorithms is twofold: as the main building block in constraint-based structure learning of Bayesian network algorithms and as a technique to derive the optimal set of features in filter feature selection approaches. Equally, learning from partially labelled data is a crucial and demanding area of machine learning, and extending techniques from fully t...
متن کاملWeakly Supervised Learning Algorithms and an Application to Electromyography
In the standard machine learning framework, training data is assumed to be fully supervised. However, collecting fully labelled data is not always easy. Due to cost, time, effort or other types of constraints, requiring the whole data to be labelled can be difficult in many applications, whereas collecting unlabelled data can be relatively easy. Therefore, paradigms that enable learning from un...
متن کاملLearning Discriminative Sequence Models from Partially Labelled Data for Activity Recognition
Recognising daily activity patterns of people from low-level sensory data is an important problem. Traditional approaches typically rely on generative models such as the hidden Markov models and training on fully labelled data. While activity data can be readily acquired from pervasive sensors, e.g. in smart environments, providing manual labels to support fully supervised learning is often exp...
متن کاملMMDT: Multi-Objective Memetic Rule Learning from Decision Tree
In this article, a Multi-Objective Memetic Algorithm (MA) for rule learning is proposed. Prediction accuracy and interpretation are two measures that conflict with each other. In this approach, we consider accuracy and interpretation of rules sets. Additionally, individual classifiers face other problems such as huge sizes, high dimensionality and imbalance classes’ distribution data sets. This...
متن کاملComposite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
متن کامل